Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Mastic aggregator and collector implementation #1107

Open
wants to merge 14 commits into
base: main
Choose a base branch
from

Conversation

hannahdaviscrypto
Copy link
Contributor

Implements preparation, aggregation, and unsharding for the Mastic VDAF, and adjusts the VIDPF and SZK modules in support of this development.

Hannah Davis added 2 commits August 13, 2024 19:06
Implements aggregator and collector functionality for the Mastic protocol for weighted heavy-hitters and attribute-based metrics.
@cjpatton cjpatton self-requested a review August 15, 2024 19:37
Copy link
Collaborator

@cjpatton cjpatton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comments through prep_init(). Great start!

src/flp/szk.rs Show resolved Hide resolved
src/flp/szk.rs Show resolved Hide resolved
src/flp/szk.rs Outdated Show resolved Hide resolved
src/flp/szk.rs Outdated Show resolved Hide resolved
src/flp/szk.rs Outdated Show resolved Hide resolved
src/vdaf/mastic.rs Outdated Show resolved Hide resolved
src/vdaf/mastic.rs Outdated Show resolved Hide resolved
src/vdaf/mastic.rs Outdated Show resolved Hide resolved
src/vdaf/mastic.rs Show resolved Hide resolved
src/vdaf/mastic.rs Outdated Show resolved Hide resolved
@hannahdaviscrypto hannahdaviscrypto force-pushed the mastic-agg-dev branch 3 times, most recently from e66a7fb to 69749ff Compare September 16, 2024 20:54
src/vidpf.rs Outdated Show resolved Hide resolved
src/flp/szk.rs Outdated Show resolved Hide resolved
src/idpf.rs Outdated Show resolved Hide resolved
src/vdaf/mastic.rs Outdated Show resolved Hide resolved
src/vidpf.rs Outdated Show resolved Hide resolved
src/vidpf.rs Outdated Show resolved Hide resolved
src/vidpf.rs Outdated Show resolved Hide resolved
Copy link
Collaborator

@cjpatton cjpatton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comments up to prepare_init().

src/vidpf.rs Outdated Show resolved Hide resolved
src/vidpf.rs Outdated Show resolved Hide resolved
src/vidpf.rs Outdated Show resolved Hide resolved
src/vidpf.rs Outdated Show resolved Hide resolved
src/vidpf.rs Outdated Show resolved Hide resolved
src/vdaf/mastic.rs Outdated Show resolved Hide resolved
src/vdaf/mastic.rs Outdated Show resolved Hide resolved
src/vdaf/mastic.rs Outdated Show resolved Hide resolved
src/vdaf/mastic.rs Outdated Show resolved Hide resolved
src/vdaf/mastic.rs Outdated Show resolved Hide resolved
src/vidpf.rs Outdated Show resolved Hide resolved
src/vidpf.rs Show resolved Hide resolved
src/vidpf.rs Show resolved Hide resolved
}
}

/// Vidpf evaluation state
///
/// Contains the values produced during input evaluation at a given level.
#[derive(Debug)]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@divergentdave can you check if Debug should be fenced by a feature flag? In the past we discussed using this pattern to prevent secrets from being logged in production environments. I'm not sure what the state of play is now.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't currently gate any Debug impls on that feature flag. A number of structs currently derive Debug, like Seed, Prio3InputShare, Poplar1InputShare.

src/vidpf.rs Outdated Show resolved Hide resolved
src/vdaf/mastic.rs Outdated Show resolved Hide resolved
src/vdaf/mastic.rs Outdated Show resolved Hide resolved
src/vdaf/mastic.rs Outdated Show resolved Hide resolved
src/vdaf/mastic.rs Outdated Show resolved Hide resolved
src/vdaf/mastic.rs Outdated Show resolved Hide resolved
@cjpatton cjpatton self-requested a review October 30, 2024 22:08
part.encode(bytes)?
};
if let Some(ref part) = self.helper_joint_rand_part_opt {
if let Some(ref part) = self.joint_rand_part_opt {
part.encode(bytes)?
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: rust convention (AFAIK)

Suggested change
part.encode(bytes)?
part.encode(bytes)?;

pub type SzkQueryState<const SEED_SIZE: usize> = Option<Seed<SEED_SIZE>>;

/// Verifier type for the SZK proof.
pub type SzkVerifier<F> = Vec<F>;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like we can now delete this type.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Notably, the two encode/decode implementations are unused, and since they are on a type alias, they would bleed outside of the module to any use of a Vec<T>.

.zip(helper_share.flp_verifier)
{
*x += y;
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

General control flow tip: Avoid nesting if statements. E.g., instead of

if cond1 {
   do_this();
   if cond2 {
      do_that();
   } else {
      handle_error();
   }
} else {
  handle_error();
}

do

if !cond1 {
   handle_error();
}
do_this();
if !cond2 {
    handle_error();
}
do_that();

This makes code easier to read and maintain.

leader_share.joint_rand_part_opt,
helper_share.joint_rand_part_opt,
) {
(Some(leader_part), Some(helper_part)) => Ok(Some([leader_part, helper_part])),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here we could already combine the parts into the seed. The advantage is we reduce the amount of bits on the wire by half. (Plus, IIRC, this is what the current spec says to do.)

} else {
Err(SzkError::Decide("failed to verify FLP proof".to_string()))
}
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
}
}

}
}

fn aggregate<M: IntoIterator<Item = Self::OutputShare>>(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Convention: Use the concrete type here

Suggested change
fn aggregate<M: IntoIterator<Item = Self::OutputShare>>(
fn aggregate<M: IntoIterator<Item = MasticOutputShare<T::Field>>>(

agg_shares: M,
_num_measurements: usize,
) -> Result<Self::AggregateResult, VdafError> {
let n = agg_param.level_and_prefixes.prefixes().len();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
let n = agg_param.level_and_prefixes.prefixes().len();
let num_prefixes = agg_param.level_and_prefixes.prefixes().len();

Comment on lines +693 to +695
for i in 0..n {
let encoded_result = &agg_final.0
[i * self.vidpf.weight_parameter..(i + 1) * self.vidpf.weight_parameter];
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use chunks() to get an iterator instead: https://doc.rust-lang.org/std/slice/struct.Chunks.html#example

This is easier to read and allows the compiler to do some more optimization.

result.push(
self.szk
.typ
.decode_result(&self.szk.typ.truncate(encoded_result.to_vec())?[..], 1)?,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You've hard coded the number of measurements to be 1. In fact, this is supposed to be the number of measurements that have the prefix in common. See https://github.com/jimouris/draft-mouris-cfrg-mastic/blob/482903a99a52cfb61ec06d979ade2fc0512e8c3f/poc/mastic.py#L351.

Rather than address it in this PR, I suggest leaving it as a TODO for #947 and taking care of it later.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like we need to add the counter prefix to the encoded measurement, from sharding through to here. Once that's addressed in a follow-up, I think we'll want convenience constructors for Mastic with particular circuits, to take care of constructing the Szk and Vidpf objects, and calculating the VIDPF weight parameter from szk.typ.output_len() + 1.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also noticed that the spec currently calls truncate during preparation, before producing an output share, as opposed to during unsharding.

Comment on lines +728 to +735
let first_input = VidpfInput::from_bytes(&[240u8, 0u8, 1u8, 4u8][..]);
let second_input = VidpfInput::from_bytes(&[112u8, 0u8, 1u8, 4u8][..]);
let third_input = VidpfInput::from_bytes(&[48u8, 0u8, 1u8, 4u8][..]);
let fourth_input = VidpfInput::from_bytes(&[32u8, 0u8, 1u8, 4u8][..]);
let fifth_input = VidpfInput::from_bytes(&[0u8, 0u8, 1u8, 4u8][..]);
let first_prefix = VidpfInput::from_bools(&[false, false, true]);
let second_prefix = VidpfInput::from_bools(&[false]);
let third_prefix = VidpfInput::from_bools(&[true]);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Put this in an array and reference the indices of the array below.

Suggested change
let first_input = VidpfInput::from_bytes(&[240u8, 0u8, 1u8, 4u8][..]);
let second_input = VidpfInput::from_bytes(&[112u8, 0u8, 1u8, 4u8][..]);
let third_input = VidpfInput::from_bytes(&[48u8, 0u8, 1u8, 4u8][..]);
let fourth_input = VidpfInput::from_bytes(&[32u8, 0u8, 1u8, 4u8][..]);
let fifth_input = VidpfInput::from_bytes(&[0u8, 0u8, 1u8, 4u8][..]);
let first_prefix = VidpfInput::from_bools(&[false, false, true]);
let second_prefix = VidpfInput::from_bools(&[false]);
let third_prefix = VidpfInput::from_bools(&[true]);
let prefixes = [
VidpfInput::from_bytes(&[240u8, 0u8, 1u8, 4u8][..]),
VidpfInput::from_bytes(&[112u8, 0u8, 1u8, 4u8][..]),
... and so on
];

Comment on lines +907 to +928
let (public_share, input_shares) =
mastic.shard(&(first_input.clone(), false), &nonce).unwrap();
run_vdaf_prepare(
&mastic,
&verify_key,
&first_agg_param,
&nonce,
public_share,
input_shares,
)
.unwrap();

let (public_share, input_shares) = mastic.shard(&(first_input, true), &nonce).unwrap();
run_vdaf_prepare(
&mastic,
&verify_key,
&first_agg_param,
&nonce,
public_share,
input_shares,
)
.unwrap();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should use run_vdaf() to run the whole thing end-to-end, including aggregation and collection:

pub fn run_vdaf<V, M, const SEED_SIZE: usize>(

Copy link
Contributor

@divergentdave divergentdave left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Partial review, of the SZK module:

}
}

/// Vidpf evaluation state
///
/// Contains the values produced during input evaluation at a given level.
#[derive(Debug)]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't currently gate any Debug impls on that feature flag. A number of structs currently derive Debug, like Seed, Prio3InputShare, Poplar1InputShare.

Comment on lines +48 to +51
/// Returned when a user fails to store the length of the verifier so it
/// can be properly read upon receipt.
#[error("Part of Szk query state not stored")]
InvalidState(String),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This error variant is no longer used for this purpose. It's now used when the leader share and helper share have a mismatched presence/absence of joint randomness parts. Since that error condition is not related to the prepare state, I think we can remove this InvalidState variant.

}
}

/// Joint share type for the SZK proof.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this needs some further explanation, since this is a novel term.

Suggested change
/// Joint share type for the SZK proof.
/// Joint share type for the SZK proof.
///
/// This is produced by [`Szk::merge_verifiers`] as the result of combining two query shares.
/// It contains the joint randomness parts from each aggregator, if applicable. It is consumed by [`Szk::decide`].

Or, taking https://github.com/divviup/libprio-rs/pull/1107/files#r1823527498 into account:

Suggested change
/// Joint share type for the SZK proof.
/// Joint share type for the SZK proof.
///
/// This is produced by [`Szk::merge_verifiers`] as the result of combining two query shares.
/// It contains the re-computed joint randomness seed, if applicable. It is consumed by [`Szk::decide`].

if !check_flp_proof {
return Ok(false);
}
joint_share: &[Seed<SEED_SIZE>; 2],
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we could better encapsulate the SZK logic if this took in a &SzkJointShare instead. This would let us remove the match block from Mastic::prepare_next().

) {
(Some(leader_part), Some(helper_part)) => Ok(Some([leader_part, helper_part])),
(None, None) => Ok(None),
_ => Err(SzkError::InvalidState(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As mentioned above, this error case is no longer about the prepare state, since this just depends on query shares.

@@ -5,19 +5,22 @@
//! [draft-mouris-cfrg-mastic-01]: https://www.ietf.org/archive/id/draft-mouris-cfrg-mastic-01.html

use crate::{
bt::{BinaryTree, Path},
Copy link
Contributor

@divergentdave divergentdave Nov 7, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we remove #![allow(dead_code)] etc. from the bt module now that this is using it?

Relatedly, I think we ought to make the bt module public now, since BinaryTree appears in the arguments of the public Vidpf::eval_with_cache() method.

Comment on lines +245 to +252
if cache_tree.root.is_none() {
cache_tree.root = Some(Box::new(Node::new(VidpfEvalCache {
state: VidpfEvalState::init_from_key(id, key),
share: W::zero(&self.weight_parameter), // not used
})));
}

let mut sub_tree = cache_tree.root.as_mut().expect("root was visited");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can combine these and remove the .expect() by using Option::get_or_insert_with()

Suggested change
if cache_tree.root.is_none() {
cache_tree.root = Some(Box::new(Node::new(VidpfEvalCache {
state: VidpfEvalState::init_from_key(id, key),
share: W::zero(&self.weight_parameter), // not used
})));
}
let mut sub_tree = cache_tree.root.as_mut().expect("root was visited");
let mut sub_tree = cache_tree.root.get_or_insert_with(|| {
Box::new(Node::new(VidpfEvalCache {
state: VidpfEvalState::init_from_key(id, key),
share: W::zero(&self.weight_parameter), // not used
}))
});

}
let mut cs = Vec::<VidpfProof>::with_capacity(*bits);
for _ in 0..*bits {
let mut proof = [0u8; 32];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: another place where we can use this constant

Suggested change
let mut proof = [0u8; 32];
let mut proof = [0u8; VIDPF_PROOF_SIZE];

Comment on lines +859 to +868
let mut bytes = vec![];
public.encode(&mut bytes).unwrap();

assert_eq!(public.encoded_len().unwrap(), bytes.len());

let decoded = VidpfPublicShare::<TestWeight>::decode_with_param(
&(8, TEST_WEIGHT_LEN),
&mut Cursor::new(&bytes),
)
.unwrap();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Using the get_... helper methods can save some boilerplate, and gets us a check for unused bytes for free.

Suggested change
let mut bytes = vec![];
public.encode(&mut bytes).unwrap();
assert_eq!(public.encoded_len().unwrap(), bytes.len());
let decoded = VidpfPublicShare::<TestWeight>::decode_with_param(
&(8, TEST_WEIGHT_LEN),
&mut Cursor::new(&bytes),
)
.unwrap();
let bytes = public.get_encode().unwrap();
assert_eq!(public.encoded_len().unwrap(), bytes.len());
let decoded = VidpfPublicShare::<TestWeight>::get_decoded_with_param(
&(8, TEST_WEIGHT_LEN),
&bytes,
)
.unwrap();

/// Mastic prepare state.
///
/// State held by an aggregator between rounds of Mastic preparation. Includes intermediate
/// state for [`Szk``] verification, the output shares currently being validated, and
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: typo

Suggested change
/// state for [`Szk``] verification, the output shares currently being validated, and
/// state for [`Szk`] verification, the output shares currently being validated, and

@@ -226,6 +225,57 @@ impl<W: VidpfValue, const NONCE_SIZE: usize> Vidpf<W, NONCE_SIZE> {
})
}

/// [`Vidpf::eval_with_cache`] evaluates the entire `input` and produces a share of the
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here and elsewhere, we can drop the name of methods from their documentation, since this will appear just below the full method signature.

Suggested change
/// [`Vidpf::eval_with_cache`] evaluates the entire `input` and produces a share of the
/// Evaluates the entire `input` and produces a share of the

let second_prefix = VidpfInput::from_bools(&[false]);
let third_prefix = VidpfInput::from_bools(&[true]);
let mastic = Mastic::new(algorithm_id, szk, sum_vidpf, 32);
let first_agg_param = MasticAggregationParam::new(vec![first_prefix], true).unwrap();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should add some test cases covering require_weight_check = false.


/// Mastic prepare share.
///
/// Broadcast message from an aggregator between rounds of Mastic. Includes the
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's a suggested rephrasing

Suggested change
/// Broadcast message from an aggregator between rounds of Mastic. Includes the
/// Broadcast message from an aggregator during preparation. Includes the

None
};
let (prep_share, prep_state) =
if let Some((szk_query_share, szk_query_state)) = szk_verify_opt {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this code could be further simplified by fusing this if-else and the if-else above. szk_verify_opt is constructed by the previous if-else block, and immediately destructured here, with no other uses

result.push(
self.szk
.typ
.decode_result(&self.szk.typ.truncate(encoded_result.to_vec())?[..], 1)?,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like we need to add the counter prefix to the encoded measurement, from sharding through to here. Once that's addressed in a follow-up, I think we'll want convenience constructors for Mastic with particular circuits, to take care of constructing the Szk and Vidpf objects, and calculating the VIDPF weight parameter from szk.typ.output_len() + 1.

result.push(
self.szk
.typ
.decode_result(&self.szk.typ.truncate(encoded_result.to_vec())?[..], 1)?,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also noticed that the spec currently calls truncate during preparation, before producing an output share, as opposed to during unsharding.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants